AITopics | morphological complexity

Collaborating Authors

morphological complexity

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

On the Interplay between Positional Encodings, Morphological Complexity, and Word Order Flexibility

Tatariya, Kushal, Poelman, Wessel, de Lhoneux, Miryam

arXiv.org Artificial IntelligenceNov-12-2025

Language model architectures are predominantly first created for English and subsequently applied to other languages. It is an open question whether this architectural bias leads to degraded performance for languages that are structurally different from English. We examine one specific architectural choice: positional encodings, through the lens of the trade-off hypothesis: the supposed interplay between morphological complexity and word order flexibility. This hypothesis posits a trade-off between the two: a more morphologically complex language can have a more flexible word order, and vice-versa. Positional encodings are a direct target to investigate the implications of this hypothesis in relation to language modelling. We pretrain monolingual model variants with absolute, relative, and no positional encodings for seven typologically diverse languages and evaluate them on four downstream tasks. Contrary to previous findings, we do not observe a clear interaction between position encodings and morphological complexity or word order flexibility, as measured by various proxies. Our results show that the choice of tasks, languages, and metrics are essential for drawing stable conclusions

computational linguistic, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2511.08139

Country:

Europe (1.00)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Linguistic and Embedding-Based Profiling of Texts generated by Humans and Large Language Models

Zanotto, Sergio E., Aroyehun, Segun

arXiv.org Artificial IntelligenceOct-1-2025

The rapid advancements in large language models (LLMs) have significantly improved their ability to generate natural language, making texts generated by LLMs increasingly indistinguishable from human-written texts. While recent research has primarily focused on using LLMs to classify text as either human-written or machine-generated texts, our study focuses on characterizing these texts using a set of linguistic features across different linguistic levels such as morphology, syntax, and semantics. We select a dataset of human-written and machine-generated texts spanning 8 domains and produced by 11 different LLMs. We calculate different linguistic features such as dependency length and emotionality, and we use them for characterizing human-written and machine-generated texts along with different sampling strategies, repetition controls, and model release dates. Our statistical analysis reveals that human-written texts tend to exhibit simpler syntactic structures and more diverse semantic content. Furthermore, we calculate the variability of our set of features across models and domains. Both human- and machine-generated texts show stylistic diversity across domains, with human-written texts displaying greater variation in our features. Finally, we apply style embeddings to further test variability among human-written and machine-generated texts. Notably, newer models output text that is similarly variable, pointing to a homogenization of machine-generated texts.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2507.13614

Country:

Europe (1.00)
Asia (0.68)
North America > United States (0.28)
North America > Mexico > Mexico City (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Media > News (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)

Add feedback

The Morphology-Control Trade-Off: Insights into Soft Robotic Efficiency

Xie, Yue, Chu, Kai-feng, Wang, Xing, Iida, Fumiya

arXiv.org Artificial IntelligenceMar-20-2025

Soft robotics holds transformative potential for enabling adaptive and adaptable systems in dynamic environments. However, the interplay between morphological and control complexities and their collective impact on task performance remains poorly understood. Therefore, in this study, we investigate these trade-offs across tasks of differing difficulty levels using four well-used morphological complexity metrics and control complexity measured by FLOPs. We investigate how these factors jointly influence task performance by utilizing the evolutionary robot experiments. Results show that optimal performance depends on the alignment between morphology and control: simpler morphologies and lightweight controllers suffice for easier tasks, while harder tasks demand higher complexities in both dimensions. In addition, a clear trade-off between morphological and control complexities that achieve the same task performance can be observed. Moreover, we also propose a sensitivity analysis to expose the task-specific contributions of individual morphological metrics. Our study establishes a framework for investigating the relationships between morphology, control, and task performance, advancing the development of task-specific robotic designs that balance computational efficiency with adaptability. This study contributes to the practical application of soft robotics in real-world scenarios by providing actionable insights.

artificial intelligence, machine learning, reinforcement learning, (21 more...)

arXiv.org Artificial Intelligence

2503.16127

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Oceania > Australia > Australian Capital Territory > Canberra (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)
(2 more...)

Add feedback

Tokenization and Morphology in Multilingual Language Models: A Comparative Analysis of mT5 and ByT5

Dang, Thao Anh, Raviv, Limor, Galke, Lukas

arXiv.org Artificial IntelligenceOct-21-2024

Morphology is a crucial factor for multilingual language modeling as it poses direct challenges for tokenization. Here, we seek to understand how tokenization influences the morphological knowledge encoded in multilingual language models. Specifically, we capture the impact of tokenization by contrasting two multilingual language models: mT5 and ByT5. The two models share the same architecture, training objective, and training data and only differ in their tokenization strategies: subword tokenization vs.\@ character-level tokenization. Probing the morphological knowledge encoded in these models on four tasks and 17 languages, our analyses show that the models learn the morphological systems of some languages better than others and that morphological information is encoded in the middle and late layers. Finally, we show that languages with more irregularities benefit more from having a higher share of the pre-training data.

computational linguistic, language model, proceedings, (13 more...)

arXiv.org Artificial Intelligence

2410.11627

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
Europe > Denmark > Southern Denmark (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

A Morphology-Based Investigation of Positional Encodings

Ghosh, Poulami, Vashishth, Shikhar, Dabre, Raj, Bhattacharyya, Pushpak

arXiv.org Artificial IntelligenceMay-30-2024

Contemporary deep learning models effectively handle languages with diverse morphology despite not being directly integrated into them. Morphology and word order are closely linked, with the latter incorporated into transformer-based models through positional encodings. This prompts a fundamental inquiry: Is there a correlation between the morphological complexity of a language and the utilization of positional encoding in pre-trained language models? In pursuit of an answer, we present the first study addressing this question, encompassing 22 languages and 5 downstream tasks. Our findings reveal that the importance of positional encoding diminishes with increasing morphological complexity in languages. Our study motivates the need for a deeper understanding of positional encoding, augmenting them to better reflect the different languages under consideration.

computational linguistic, morphological complexity, word order, (12 more...)

arXiv.org Artificial Intelligence

2404.0453

Country:

Europe > Italy > Tuscany > Florence (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
Asia > India (0.04)
(9 more...)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Still no evidence for an effect of the proportion of non-native speakers on language complexity -- A response to Kauhanen, Einhaus & Walkden (2023)

Koplenig, Alexander

arXiv.org Artificial IntelligenceMay-30-2023

In a recent paper published in the Journal of Language Evolution, Kauhanen, Einhaus & Walkden (https://doi.org/10.1093/jole/lzad005, KEW) challenge the results presented in one of my papers (Koplenig, Royal Society Open Science, 6, 181274 (2019), https://doi.org/10.1098/rsos.181274), in which I tried to show through a series of statistical analyses that large numbers of L2 (second language) speakers do not seem to affect the (grammatical or statistical) complexity of a language. To this end, I focus on the way in which the Ethnologue assesses language status: a language is characterised as vehicular if, in addition to being used by L1 (first language) speakers, it should also have a significant number of L2 users. KEW criticise both the use of vehicularity as a (binary) indicator of whether a language has a significant number of L2 users and the idea of imputing a zero proportion of L2 speakers to non-vehicular languages whenever a direct estimate of that proportion is unavailable. While I recognise the importance of post-publication commentary on published research, I show in this rejoinder that both points of criticism are explicitly mentioned and analysed in my paper. In addition, I also comment on other points raised by KEW and demonstrate that both alternative analyses offered by KEW do not stand up to closer scrutiny.

artificial intelligence, machine learning, proportion, (16 more...)

arXiv.org Artificial Intelligence

2305.00217

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Texas > Brazos County > College Station (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
(2 more...)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.66)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Add feedback

Machine learning improves Arabic speech transcription capabilities

MIT Technology ReviewNov-24-2021, 14:20:07 GMT

Thanks to advancements in speech and natural language processing, there is hope that one day you may be able to ask your virtual assistant what the best salad ingredients are. Currently, it is possible to ask your home gadget to play music, or open on voice command, which is a feature already found in some many devices. If you speak Moroccan, Algerian, Egyptian, Sudanese, or any of the other dialects of the Arabic language, which are immensely varied from region to region, where some of them are mutually unintelligible, it is a different story. If your native tongue is Arabic, Finnish, Mongolian, Navajo, or any other language with high level of morphological complexity, you may feel left out. These complex constructs intrigued Ahmed Ali to find a solution.

arabic, arabic speech transcription capability, dialect, (10 more...)

MIT Technology Review

AI-Alerts: 2021 > 2021-11 > AAAI AI-Alert for Nov 30, 2021 (1.00)

Country:

Africa > Sudan (0.25)
Asia > Middle East > Qatar (0.07)
Africa > Middle East > Egypt > Aswan Governorate > Aswan (0.05)

Genre: Instructional Material (0.31)

Industry:

Education > Educational Setting > Online (0.50)
Education > Educational Technology > Educational Software > Computer Based Training (0.31)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.71)

Add feedback